VIDEO PLAYBACK UNIT, VIDEO DELIVERY UNIT AND RECORDING MEDIUM

BACKGROUND OF THE INVENTION 
Field of the Invention 

The present invention relates to a video playback unit
for playing back one video or a plurality of videos, a video
delivery unit, and a recording medium.
Description of the Related Art 

First, one example of a conventional unit capable of
playing back and retrieving one video will be described with
reference to FIG. 25. In FIG. 25, a video playback portion
120 reads video data in order from a storage unit 110 under
control of a controller 100, and outputs a screen output
signal, an audio output signal and a current playback time
to a display 130. Further, the video playback portion 120
takes the length of the entire video as a playback length P
and outputs it to a playback starting time setting portion
101. When the time at which to start playback is set at the
playback starting time setting portion 101, the playback
starting time setting portion 101 outputs a setting time (q)
to the video playback portion 120. When the setting time (q)
is inputted, the video playback portion 120 takes from the
storage unit 110 the video data at the recording position
in the vicinity of the setting time and plays it back.
The display 130 takes the screen output signal, the audio
output signal and the current playback time, and presents
the video output, the audio output and the current playback
time.

Another example of the conventional unit will be
described with reference to FIG. 26. This conventional
example is, as shown in the drawing, constituted by a client
terminal 105 serving as the playback unit and a server system
102, which are connected by a network 104, and the video data
are stored in the server system 102.

The controller 100 sends a control signal to the video
playback portion 120, and requests the video playback portion
120 to play back a video. The video playback portion 120
sends a video request signal (r) to the server system 102 via
the network 104.

The server system 102 reads the request signal (r) by a video
server 103, and sends the designated video data to the video
server 103 from the storage unit 110. The video server 103
transmits the video data to the client terminal 105 via the
network 104.

The client terminal 105 stores the transmitted video
data once in a buffer 106, and transfers it to the video
playback portion 120. The video playback portion 120 reads
the video data and outputs the screen output signal, the
audio output signal and the current playback time to a display
130. Subsequent retrieval actions using a playback
starting time setting portion 101 are the same as those
described above for FIG. 25 and, therefore, the description
thereof will be omitted.

The playback starting time (setting time (q)) has
conventionally been set at the playback starting time setting
portion 101 of FIG. 25 and FIG. 26 by one of the following
two methods: (1) directly inputting the playback starting
time; or (2) taking the length of the entire video as 100%
and determining the playback starting time as a designated
ratio of that length.
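
By way of illustration only, the two conventional setting methods can be sketched as follows. The function names and the use of seconds as the time unit are assumptions made for this sketch and do not appear in the conventional units described above.

```python
def start_time_direct(playback_length_s: float, q_s: float) -> float:
    """Method (1): accept a directly entered time, clamped to the video length."""
    return min(max(q_s, 0.0), playback_length_s)


def start_time_from_ratio(playback_length_s: float, ratio_percent: float) -> float:
    """Method (2): treat the whole video as 100% and map a ratio to a time."""
    if not 0.0 <= ratio_percent <= 100.0:
        raise ValueError("ratio_percent must lie in [0, 100]")
    return playback_length_s * ratio_percent / 100.0
```

Neither function tells the user what is recorded at the resulting time, which is the shortcoming discussed next.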

However, since the above described setting methods (1),
(2) for the playback starting time do not reveal what sort
of video is stored at which position of the video data,
a user is obliged to set several different times by means of
a time code or a scroll bar and to play back from each of
those setting times until the target video is found. For
this reason, there was a problem that the retrieval of the
videos took a lot of time and labor.

Next, one example of the conventional unit for
simultaneously playing back a plurality of videos is shown
in FIG. 27. Under control of the controller 100, a plurality
of video data 1 to n are taken in order from the storage
unit 110 and inputted to video playback portions 101 to 10n.
The video playback portions 101 to 10n read the video data
in order and output the screen output signal, the audio
output signal and the current playback time to the display
130. The display 130 takes the playback time, the screen
output signal and the audio output signal, and presents the
playback time, the screen output and the audio output.



This conventional unit has a demerit that, since a
processing load proportional to the number of video data
is imposed, the number of videos that can be played back is
limited when the playback processing is performed by, for
example, software.

FIG. 28 shows a conventional unit for simultaneously playing
back a plurality of videos and playing back from a designated
time. For example, when the length of the entire video is
inputted from the video playback portion 101 to a playback
starting time setting portion 201 as a playback length, the
playback starting time setting portion 201 sets the time from
which to perform the playback and outputs the setting time
to the video playback portion 101. The video playback portion
101 then determines the recording position information of
the video data in the vicinity of the setting time, takes
the video data at that recording position from the storage
unit 110, and plays it back.

The setting method of the playback starting time at 
the playback starting time setting portion 201 uses the 
following two methods: 

(1) The method of directly inputting the playback starting
time.

(2) The method of taking the entire length of video data 
as 100% and determining a designated ratio within that length 
as the playback starting time. 

Since this conventional unit does not reveal what sort
of video is stored at which position, the user is obliged
to set several different times and to play back from each
of those times by way of trial in order to find and play
back the target video. For this reason, there was a problem
that the retrieval of the videos took a lot of time and
labor.

Other conventional units include those in which the
above described storage unit 110 is provided within a
server system connected via a network, and the videos
necessary for playback are transmitted to a plurality of
video playback portions via the network. In such a
conventional unit, however, the bandwidth necessary for
transmitting the videos depends on the number of videos,
so there was a problem that a high-bandwidth line is
necessary for simultaneous network transmission.

SUMMARY OF THE INVENTION 

An object of the present invention is to provide a video
playback unit and a video delivery unit capable of
efficiently browsing the video scenes contained in video
files stored in the storage unit or in a network-connected
server, or of efficiently retrieving target scenes.

Another object of the present invention is to provide
a video playback unit and a delivery unit for a plurality
of videos which are capable of simultaneously playing back
a plurality of video files stored in the storage unit or in
a network-connected server without increasing the processing
load of the playback terminal, of transmitting and browsing
a plurality of videos even within a limited network
bandwidth, and of browsing, retrieving and playing back the
video scenes contained in the videos.

Still another object of the present invention is to
provide a computer-readable recording medium on which is
recorded a program for performing the above playback
processing or delivery processing.
In order to achieve the above described objects, a first
characteristic of the present invention is that the video
playback unit comprises: video playback means for reading
in a designated video file and outputting the video of the
video file in order to play it back; scene description file
read-in means for reading in the scene description file which
describes the scenes inside the video file; means for
outputting, from among the time information described in the
scene description file, the time information sequence
existing before and after the playback time of the video;
means for outputting the still image sequence corresponding
to the time information displayed, wherein the still images
are described in the scene description file; means for
renewing the display of the time information sequence and
the still image sequence in synchronization with the playback
time of the video; and display means for displaying the above
described video, time information sequence and still image
sequence.



According to this characteristic, the scene 
description information can be displayed in step with the 
playback of the video, and the retrieval and browsing before 
and after the video that is being played back can be 
effectively performed. 

Further, a second characteristic of the present
invention is that it comprises: video description file
processing means for reading in a video description file
of a designated video group; main video playback means for
playing back a first main video file designated by the video
information described in the video description file; proxy
video playback means for playing back a second proxy video
file designated by the video information described in the
video description file; and display means for displaying
the first main video and the second proxy video played back
by the main video playback means and the proxy video playback
means, wherein the above described proxy video file is made
smaller in file size or in coded bit rate than the above
described main video file.

According to this characteristic, since a plurality 
of videos are constituted by the main video and the proxy 
video and played back, a plurality of videos can be 
effectively played back even in a limited transmission 
bandwidth or decoding capacity. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a block diagram showing a schematic
constitution of a first embodiment of the present invention;

FIG. 2 is a conceptual illustration of videos stored
in a storage unit of FIG. 1;

FIGS. 3A, 3B are views showing display examples of videos
displayed on a display of FIG. 1;

FIGS. 4A, 4B are views showing the display examples of
videos displayed on a display of FIG. 1;

FIG. 5 is a flowchart for explaining one example of the
function of a time information sequence output portion of
FIG. 1;

FIGS. 6A, 6B are explanatory drawings of specific
examples of a time information sequence and a still image
sequence;

FIG. 7 is a block diagram showing a schematic
constitution of a second embodiment of the present invention;

FIG. 8 is a view showing a display example of the video
displayed on the display of FIG. 7;

FIG. 9 is a block diagram showing a schematic
constitution of the third embodiment of the present
invention;

FIG. 10 is a block diagram showing the schematic
constitution of a fourth embodiment of the present invention;

FIG. 11 is a block diagram showing a modified embodiment
of the fourth embodiment;

FIG. 12 is a block diagram showing the schematic
constitution of a fifth embodiment of the present invention;

FIG. 13 is a block diagram showing the schematic
constitution of a sixth embodiment of the present invention;

FIGS. 14A, 14B are views showing display examples of
a main video and a proxy video;

FIG. 15 is an explanatory drawing of a first abstracted
video;

FIG. 16 is an explanatory drawing of a second abstracted
video;

FIG. 17 is an explanatory drawing of a third abstracted
video;

FIG. 18 is a block diagram showing the schematic
constitution of a seventh embodiment of the present
invention;

FIG. 19 is a block diagram showing a modified embodiment
of the seventh embodiment;

FIG. 20 is a block diagram showing the schematic
constitution of an eighth embodiment of the present
invention;

FIGS. 21A, 21B are explanatory drawings of a switching
processing of the main video and the proxy video;

FIG. 22 is a block diagram showing the schematic
constitution of a ninth embodiment of the present invention;

FIG. 23 is a block diagram showing the schematic
constitution of a tenth embodiment of the present invention;

FIG. 24 is a block diagram showing the schematic
constitution of an eleventh embodiment of the present
invention;

FIG. 25 is a block diagram showing one example of a
conventional unit;

FIG. 26 is a block diagram showing another example of
the conventional unit;

FIG. 27 is a block diagram showing another example of
the conventional unit; and

FIG. 28 is a block diagram showing another example of
the conventional unit.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT 

FIG. 1 is a block diagram showing a schematic
constitution of a first embodiment of the present invention.
The video file stored in a storage unit 1, as shown in FIG. 2
for example, comprises at least video data (containing
audio data), still image sequence data V (t_n) = P_n and
time information sequence data (t_n) showing the times of
the still image sequence data.
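
By way of illustration only, the contents of such a video file can be modelled as below. The class names, field names and the use of file paths to refer to still images are assumptions of this sketch; the specification does not prescribe a storage format.

```python
from dataclasses import dataclass, field


@dataclass
class SceneEntry:
    t: float           # t_n: scene time measured from the start of the video
    still_image: str   # P_n: reference to the still image at that time (hypothetical path)


@dataclass
class VideoFile:
    video_data: str                                   # reference to the video (with audio)
    scenes: list[SceneEntry] = field(default_factory=list)

    def still_at(self, t: float) -> str:
        """Return P_n for an entry time t_n, i.e. the relation V(t_n) = P_n."""
        for entry in self.scenes:
            if entry.t == t:
                return entry.still_image
        raise KeyError(t)
```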

A video playback portion 2 takes in order, from the
storage unit 1, the video data of the video file designated
under control of a controller 7, and outputs a screen output
signal, an audio output signal and a current playback time
to a display 6. The display 6 takes the current playback
time, the screen output signal and the audio output signal,
and presents the playback time, the screen output and the
audio output. The video playback portion 2 can play back the
video from a time decided, for example, by time information
selected from at least one of the time information sequence
and the still image sequence to be described later. For
example, in the display screens of FIGS. 3A, 4A to be
described later, if any item of a time information sequence
T or a still image sequence P is selected with a mouse or
the like, the playback can be started from the selected time
or from the video at the selected still image.

A scene description file read-in portion 3 can read
the time information sequence data positioned before and
after the current playback time from the time information
sequence file stored in the storage unit 1 under control of
the controller 7. The scene description file read-in portion
3 can also read the still image sequence data corresponding
to the time information sequence data from a still image
file stored in the storage unit 1. The scene description
file read-in portion 3 can read in a predetermined number
of pieces of the above described time information sequence
data and still image sequence data.

The current playback time (Tv) obtained in the video
playback portion 2 is inputted to the scene description file
read-in portion 3 and a time information sequence output
portion 4. The scene description file read-in portion 3
takes the time information sequence data and the still image
sequence data from the storage unit 1 based on the current
playback time information (Tv), and inputs the time
information sequence data (a) to the time information
sequence output portion 4 and the still image sequence data
(b) to a still image sequence output portion 5, whereby the
time information sequence and the still image sequence are
displayed on the display 6. Incidentally, assuming that the
playback starting time of the video data is taken as (T0),
the time information sequence data (t_n) in the video file
and the current playback time (Tv) are related by
t_n = Tv - T0, and the scene description file read-in portion
3 can take the time information sequence data and the still
image data from the storage unit 1 by using this relationship.
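
By way of illustration only, this relationship can be used to locate the entries around the current playback position as sketched below. The function name, the sorted-list representation of the time information sequence and the default of two entries on either side are assumptions of this sketch.

```python
from bisect import bisect_right


def entries_around(t_seq, tv, t0, before=2, after=2):
    """Return the indices of the scene times surrounding the current position.

    t_seq is the sorted time information sequence (t_0, t_1, ...), tv the
    current playback time and t0 the playback starting time.
    """
    t_now = tv - t0                          # convert the playback clock to file time
    n = max(bisect_right(t_seq, t_now) - 1, 0)   # last entry with t_n <= t_now
    lo = max(n - before, 0)
    hi = min(n + after, len(t_seq) - 1)
    return list(range(lo, hi + 1))
```

With scene times every ten seconds, `entries_around([0, 10, 20, 30, 40, 50, 60], tv=135, t0=100)` selects entries 1 to 5, centred on the entry at 30 seconds.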

Display examples of the display 6 are shown in FIGS. 3A,
3B and FIGS. 4A, 4B. On the display 6, a screen output V
(Tv) of the current playback time, the time information
sequence T_{n-2}, T_{n-1}, T_n, T_{n+1}, T_{n+2} and the
still images P_{n-2}, P_{n-1}, P_n, P_{n+1}, P_{n+2} are,
for example, displayed as illustrated. Further, the time
information sequence T and the still images P are displayed
in relation to the screen output V (Tv).

The display of the still image sequence P on the display
6 is renewed as follows. Assume that the video being played
back at the time (Tv) is V (Tv), that the nth time
information in the scene description file is (T_n), and that
the corresponding still image is (P_n). In the examples of
FIGS. 3A, 3B, the display is renewed at the point of time
when (Tv) has passed T_{n+1}. In the examples of FIGS. 4A,
4B, on the other hand, at the point of time when the time
(Tv) of the playback video has passed T_{n+5}, the whole of
the displayed time information sequence T and still image
sequence P is erased, and the display content is renewed by
displaying the time information sequence T and the still
image sequence P which begin from the new time information
T_{n+5} and the new still image P_{n+5}. This manner of
renewal is just one example, and the display content may be
renewed in other ways.
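
By way of illustration only, the two renewal styles can be contrasted as index windows over the scene entries. The function names and the window size of five are assumptions of this sketch, and clipping to the valid index range is omitted for brevity.

```python
def sliding_window(n, size=5):
    """FIGS. 3A, 3B style: the window follows the current entry n one step at a time."""
    half = size // 2
    return list(range(n - half, n + half + 1))


def paged_window(n, base=0, size=5):
    """FIGS. 4A, 4B style: the window is held, then replaced wholesale every `size` entries."""
    start = base + ((n - base) // size) * size
    return list(range(start, start + size))
```

For a current entry n = 7, `sliding_window(7)` and `paged_window(7)` both return entries 5 to 9, but advancing to n = 8 shifts only the sliding window.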

The renewal of the display content can be performed
by the time information sequence output portion 4. The
renewal action of the time information sequence output
portion 4 will be described with reference to the flowchart
of FIG. 5. In step S1, the time information sequence output
portion 4 receives a playback starting time (T0) and a
current playback time (Tv) from the video playback portion
2. In step S2, it is determined whether or not the current
playback time (Tv) satisfies T_n ≤ Tv < T_{n+1} (provided
that T_n = t_n + T0) for the time information (T_n) and
(T_{n+1}) of the still images. When this determination is
affirmative, the process advances to step S3 and the still
images P_{n-2} to P_{n+2} are displayed. When the
determination of step S2 is negative, on the other hand, the
process advances to step S4 and the display of the still
images is changed to P_{n-1} to P_{n+3}. In step S5, it is
determined whether or not the playback processing is
completed and, when this determination is negative, the
process advances to step S6, where 1 is added to n.
Thereafter, the same actions as described above are repeated
until the determination of step S5 becomes affirmative.
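
By way of illustration only, one possible reading of the flowchart of FIG. 5 is sketched below. It assumes that n is advanced only when the test of step S2 fails because Tv has passed T_{n+1}, and that playback times are sampled from a list; both are assumptions of this sketch rather than details given in the flowchart description. The returned windows are not clipped to the valid index range.

```python
def renew_display(t_seq, t0, playback_times, n=0, half=2):
    """Return the (first, last) still-image indices shown at each sampled Tv."""
    windows = []
    for tv in playback_times:                  # S1: receive T0 and the current Tv
        # S2: test T_n <= Tv < T_{n+1}, where T_k = t_k + T0.
        while n + 1 < len(t_seq) and tv >= t_seq[n + 1] + t0:
            n += 1                             # S4/S6: Tv has passed T_{n+1}, advance n
        windows.append((n - half, n + half))   # S3: show P_{n-2} .. P_{n+2}
    return windows                             # S5: finished when the samples run out
```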

Incidentally, as for the content of the time information
sequence data (t_n), various description methods related to
the videos can be used. For example, it is possible to use
a time code sequence increasing by a designated time step,
a time code sequence of the cut points at which the video
scene changes, a time code sequence of key frame points each
showing the center of a scene, a time code sequence showing
the points at which the accompanying audio changes from
audible sound to non-audible sound, a time code sequence
showing the times at which a specific sound effect such as
clapping, laughter and the like was generated, a time code
sequence showing the times at which a specific video such as
a telop, a CG video and the like was generated, a time code
sequence designated arbitrarily by the user, or a time code
sequence combining those described above. Further, the
starting time and the section length of each scene can
be used as the time information sequence.
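
By way of illustration only, two of these description methods, a fixed-step sequence and a combination of several sequences, can be sketched as follows. The function names and the use of seconds are assumptions of this sketch; how cut points or key frames are detected is outside its scope.

```python
def fixed_step_times(duration_s, step_s):
    """Time codes increasing by a designated step, from 0 up to the video length."""
    if step_s <= 0:
        raise ValueError("step_s must be positive")
    times = []
    i = 0
    while i * step_s < duration_s:
        times.append(i * step_s)
        i += 1
    return times


def combine_time_codes(*sequences):
    """Merge several time code sequences into one sorted sequence without duplicates."""
    return sorted(set().union(*sequences))
```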

FIG. 6A shows an example where video scene switching
times t1, t2, ... are used as the time information sequence
and the videos P1, P2, ... at those times are used as the
still image sequence. FIG. 6B shows an example where key
frame point times t1, t2, ... in the scenes are used as the
time information sequence and the videos P1, P2, ... at those
times are used as the still image sequence.

Incidentally, when the scene description file is read
into the scene description file read-in portion 3 from the
storage unit, two read-in methods are available: reading in
the whole of the scene description file, and reading in only
the part of the scene description file corresponding to a
fixed number of entries of the time information sequence and
the still image sequence. In the former case, reading takes
a long time because all the information is read in, but
there is a merit that the scene description information can
be outputted at high speed once it has been read. In the
latter case, on the other hand, although the reading time
of the scene description file is short, it is necessary to
read in the scene description information from the storage
unit each time the display of the scene description
information is renewed.

Next, a second embodiment of the present invention will 
be described with reference to the block diagram of FIG. 7. 
In this embodiment, the video playback and the display of
the related still image sequence are performed from the
video file and the scene description file in response to a
command input.

In response to a command from the command input portion 8,
the video data of the video file designated by the
controller 7 are taken in order from the storage unit 1 and
inputted to the video playback portion 2. The video playback
portion 2 takes the video data in order and outputs the
screen output signal, the audio output signal and the
current playback time to the display 6. The display 6 takes
the playback time, the screen output signal and the audio
output signal, and presents the playback time, the screen
output and the audio output.

Further, when time information is inputted to the
command input portion 8, the designated time information is
inputted to the video playback portion 2 via the controller
7, and the playback is started from the designated time.
Further, the playback time information (Tv) is inputted to
the time information sequence output portion 4 similarly
to the first embodiment and, with the elapse of time, the
time information sequence and the still image sequence
information are renewed, and the new time information
sequence and still image sequence are displayed on the
display 6.

As for the command input method, various input methods
are conceivable. One example is a method of directly
designating the time information, whereby the video
playback and the display of the time information sequence
and the still image sequence are performed using the time
inputted by the user. Another method is to designate a piece
of time information in the displayed time information
sequence, or to designate a still image in the displayed
still image sequence, and to input the designated time or
still image. Further, it is possible to perform a time
designation by using means, such as skip buttons 20, 21,
22, 23 as shown in FIG. 8, for designating the preceding or
following time information sequence or the preceding or
following time information.

For example, in FIG. 8, while the time information
sequence (T_{n-2} to T_{n+2}) and the still image sequence
(P_{n-2} to P_{n+2}) are displayed, the skip button 20 can
designate a move back by five positions, to the time
information sequence (T_{n-7} to T_{n-3}) and the still
image sequence (P_{n-7} to P_{n-3}). Further, the skip
button 21 can designate a move back by one position, to the
time information sequence (T_{n-3} to T_{n+1}) and the still
image sequence (P_{n-3} to P_{n+1}).
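
By way of illustration only, the window movement designated by the skip buttons can be sketched as below. The offsets for buttons 20 and 21 follow the description above; the forward offsets assigned to buttons 22 and 23 are an assumption of this sketch, since only the backward buttons are described here.

```python
# Offsets applied to the centre index n of the displayed window.
# Buttons 22 and 23 are assumed to mirror buttons 21 and 20 in the forward direction.
SKIP_OFFSETS = {20: -5, 21: -1, 22: +1, 23: +5}


def skip(center, button, count, half=2):
    """Return the new centre index and the (first, last) indices of the new window."""
    new_center = center + SKIP_OFFSETS[button]
    new_center = min(max(new_center, 0), count - 1)   # stay within the sequence
    return new_center, (new_center - half, new_center + half)
```

Starting from a window centred on n = 10, button 20 yields the window 3 to 7 (that is, n-7 to n-3) and button 21 yields 7 to 11 (n-3 to n+1).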

Next, a third embodiment of the present invention will
be described with reference to the block diagram of FIG. 9.
The scene description file read-in portion 3 performs reading
from the storage unit 1 via a cache memory 9. The cache
memory 9 reads in order, from the storage unit 1, the scene
description information to be displayed next after the scene
description information currently being displayed on the
display 6. The actions thereafter are the same as those of
the first embodiment and, therefore, the description thereof
will be omitted.

When the scene description information file is large,
a large bandwidth is required each time the scene
description information is read, and it takes a long time
until the next scene description information has been read.
In addition, there is a possibility of encroaching on the
bandwidth needed to read in the video data from the storage
unit. When the cache memory 9 is used, however, the scene
description information can be continuously displayed on
the display 6. Further, by continuously reading the scene
description information into the cache memory 9, the data
transmission bandwidth can be made constant and it is,
therefore, possible to continuously play back the video
without encroaching on the bandwidth necessary for reading
the video from the storage unit 1.
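
By way of illustration only, the read-ahead behaviour of such a cache can be sketched as follows. The class name, the `read_entry` callback standing in for an access to the storage unit, and the buffer depth are assumptions of this sketch.

```python
from collections import deque


class SceneCache:
    """Read-ahead buffer for scene description entries."""

    def __init__(self, read_entry, total, depth=5):
        self._read_entry = read_entry   # hypothetical callback: index -> entry
        self._total = total             # number of entries in the scene description file
        self._depth = depth             # maximum number of entries held ahead
        self._next = 0
        self._buffer = deque()

    def fill(self, max_reads=1):
        """Read at most max_reads entries ahead, bounding the bandwidth used per call."""
        reads = 0
        while (reads < max_reads and len(self._buffer) < self._depth
               and self._next < self._total):
            self._buffer.append(self._read_entry(self._next))
            self._next += 1
            reads += 1
        return reads

    def pop(self):
        """Hand the next entry to the display side, or None if nothing is buffered."""
        return self._buffer.popleft() if self._buffer else None
```

Calling `fill` at a steady interval with a small `max_reads` spreads the reads evenly, which corresponds to the constant transmission bandwidth described above.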

Next, a fourth embodiment of the present invention will
be described with reference to the block diagram of FIG. 10.
Although the above described first to third embodiments
were retrievable video playback units for displaying on the
display 6 the video data of the video file stored in the
storage unit 1, the time information sequence of the scene
description file, the still image sequence information and
the like, the present embodiment provides a delivery unit,
namely a server capable of delivering via a network the
video data of the video file in the storage unit 1, the time
information sequence of the scene description file, the
still image sequence information and the like.

In FIG. 10, the video file information designated at
the command input portion 21 is inputted to a delivery
controller 22 and, under control of the delivery controller
22, the video data designated at the command input portion
21 are taken in order from the storage unit 1 and inputted
to a video data transmitting portion 23. At the same time,
under control of the delivery controller 22, the scene
description data related to the video designated at the
command input portion 21 are taken in order from the storage
unit 1 and inputted to a scene description data
transmitting portion 24.

The video data and the scene description data are read
from the video data transmitting portion 23 and the scene
description data transmitting portion 24 at respective
characteristic data transmitting rates R1, R2, and are
inputted to a network transmitting portion 25, and each data
is delivered to a network 26.

The video data transmitting rate R1 can be decided by
calculating the information content per unit time from the
file size of the video file and the video playback time.
Further, the scene description data transmitting rate R2 can
be decided by calculating the information content per unit
time from the scene description file size and the video
playback time. Another way of deciding this transmitting
rate is to calculate the information content per unit time
from the time to be displayed last in the scene description
file and the scene description file size.
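
By way of illustration only, the rate calculation amounts to dividing a file size by a duration. The function name and the choice of bits per second as the unit are assumptions of this sketch.

```python
def transmit_rate_bps(file_size_bytes, duration_s):
    """Information content per unit time, in bits per second."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return file_size_bytes * 8 / duration_s


# Example figures (hypothetical): a 90 MB video and a 1.5 MB scene
# description file, both for a playback time of 600 seconds.
R1 = transmit_rate_bps(90_000_000, 600)   # video data transmitting rate
R2 = transmit_rate_bps(1_500_000, 600)    # scene description data transmitting rate
```

For the alternative way of deciding R2, the last time to be displayed in the scene description file would be passed as `duration_s` in place of the video playback time.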

Further, the command input portion 21, as shown in
FIG. 11, may receive the input from a terminal connected via
a network 27 at a network receiving portion 28 and take
that information as a command input. By so doing, the same
retrieval as that of the first embodiment can be performed
at that terminal.

Next, a fifth embodiment of the present invention will
be described with reference to the block diagram of FIG. 12.

In FIG. 12, when the video file is designated at the
command input portion 31 of the terminal 30, the designated
video file information is inputted to a controller 32 and
is outputted to a network transmitting portion 33. The
designated video file information is inputted from the
network transmitting portion 33 via the network 27 to a
network receiving portion 28 of the delivery unit (server
20), and is subjected to command input processing at the
command input portion 21. The video data designated by the
delivery controller 22 are taken in order from the storage
unit 1 and inputted to the video data transmitting portion
23. Similarly, under control of the delivery controller 22,
the scene description data related to the video designated
at the command input portion are taken in order from the
storage unit 1 and inputted to the scene description data
transmitting portion 24.

The video data and the scene description data are read
from the video data transmitting portion 23 and the scene
description data transmitting portion 24 at respective
characteristic data transmitting rates R1, R2, and are
inputted to the network transmitting portion 25, and the
respective data are delivered to the network. The delivered
video data and scene description data are inputted to a
network receiving portion 34 of the terminal 30 via the
network 27 and temporarily stored in a cache memory 35.

The cache memory 35 reads in, in order from the network,
the video information and the scene description information
to be displayed next after the video information and the
scene description information currently displayed on the
display 6. The actions thereafter are the same as those of
the first embodiment.

In general, when the scene description information file
is large, a large bandwidth is required each time the scene
description information is read, and it takes a long time
until the next scene description information has been read.
In addition, there is a possibility of encroaching on the
bandwidth needed to read the video data from the storage
unit. When the cache memory 35 is used, however, the scene
description information can be continuously displayed on
the display 6. Further, by continuously reading the scene
description information into the cache memory 35, the data
transmission bandwidth can be made constant and it is,
therefore, possible to continuously play back the video
without encroaching on the bandwidth necessary for reading
the video from the network-connected video delivery unit.

Furthermore, when time information is designated
at the command input portion 31, the designated time
information is inputted to the delivery controller 22 via
the controller 32, the network transmitting portion 33, the
network 27, the network receiving portion 28 and the command
input portion 21. Subsequently, the video data and the scene
description data from the designated time onward are taken
in order from the storage unit 1, and are inputted to the
video data transmitting portion 23 and the scene description
data transmitting portion 24. These data are inputted to the
video playback portion 2 and the scene description file
read-in portion 3 through the network transmitting portion
25, the network 27, the network receiving portion 34 and
the cache memory 35, and the playback of the video from the
above described designated time is started. The playback
time information (Tv) is inputted from the video playback
portion 2 to the scene description file read-in portion 3
and the time information sequence output portion 4, the time
information sequence and the still image sequence
information are renewed, and the new time information
sequence and still image sequence are displayed on the
display 6. The display state of the display 6 is the same
as in FIG. 3A and FIG. 4A.

As is clear from the above description, according to
each of the above described first to fifth embodiments, since
the scene description information can be displayed to match
the playback of the video, the retrieval and the browsing
of the video before and after the position being played back
can be effectively performed.

Further, by selecting a necessary scene, for example,
by using the skip buttons 20 to 23 of FIG. 8, playback can
be started from that scene and the scene description
information before and after that scene can be displayed.
Therefore, the video can be effectively retrieved at the
time of random access.

Furthermore, the video data and the scene description
information can be delivered at a constant transmission speed
(rate) from the delivery unit or the server unit and it is,
therefore, possible to smoothly perform the video playback
and the video retrieval in the network-connected terminal.

Next, embodiments of the invention will be described in
which a plurality of video files are simultaneously played
back without increasing the processing load of the playback
terminal, or in which a plurality of videos can be
transmitted and viewed even over a limited network bandwidth.



A sixth embodiment shown in FIG. 13 is such that a main
video file and a proxy video file are simultaneously played
back and displayed. By control of a controller 40, a video
description file processing portion 44 reads a designated
video description file from a storage unit 41. Next, the
video description file processing portion 44 reads in order
a first main video file designated by the video description
file from the storage unit 41 and inputs it to
a main video playback portion 42. The main video playback
portion 42 reads in order the main video data, which is the
first video data, and outputs a screen signal. A display
45 takes the screen output signal and the audio output signal
from the main video playback portion 42 and performs the
screen output and the audio output.

Further, the video description file processing portion
44 takes in order the proxy video file, which is a second
video designated by the video description file, from the
storage unit 41 and inputs it to the proxy video playback
portion 43. The proxy video playback portion 43 reads in
order the proxy video data and outputs the screen signal.
The display 45 takes the screen output signal and the audio
output signal from the proxy video playback portion 43 and
performs the screen output and the audio output.

As for the above described video description file format,
it can be described by using SMIL (Synchronized Multimedia
Integration Language) standardized by W3C (World Wide Web
Consortium) and the like.




As one example of the video description file format,
when the file is constituted by, for example, the main video 1
and the proxy video 2, it can be described as follows:
ID#M1, video size (HM1,VM1), display position (XM1,YM1)
ID#S2, video size (HS2,VS2), display position (XS2,YS2)
ID#M1, file storing position #M1, file name #M1
ID#S2, file storing position #S2, file name #S2

As for the output position of the proxy video, the proxy
video 2 can be displayed inside the screen of the main video 1
as shown in FIG. 14A, or the proxy video 2 can be displayed
outside of the screen of the main video 1 as shown in FIG. 14B.
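
The two display arrangements can be illustrated with a minimal sketch. It assumes that the entries of the video description file have already been parsed into records holding the video size and the display position; the record type `VideoEntry`, the function `proxy_is_inside` and the sample values are hypothetical and are not part of the described format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoEntry:
    """One entry of a (hypothetical) parsed video description file."""
    video_id: str    # e.g. "M1" for the main video, "S2" for the proxy video
    width: int       # horizontal video size (HM1 or HS2)
    height: int      # vertical video size (VM1 or VS2)
    x: int           # horizontal display position (XM1 or XS2)
    y: int           # vertical display position (YM1 or YS2)
    location: str    # file storing position
    file_name: str   # file name


def proxy_is_inside(main: VideoEntry, proxy: VideoEntry) -> bool:
    """Return True when the proxy rectangle lies entirely within the main one."""
    return (
        main.x <= proxy.x
        and main.y <= proxy.y
        and proxy.x + proxy.width <= main.x + main.width
        and proxy.y + proxy.height <= main.y + main.height
    )


# Illustrative values only.
main = VideoEntry("M1", 640, 480, 0, 0, "/videos", "main1.mpg")
inset = VideoEntry("S2", 160, 120, 470, 350, "/videos", "proxy2.mpg")
beside = VideoEntry("S2", 160, 120, 650, 0, "/videos", "proxy2.mpg")
```

With these values, `proxy_is_inside(main, inset)` corresponds to an arrangement like that of FIG. 14A and `proxy_is_inside(main, beside)` to one like that of FIG. 14B.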

Here, as the proxy video file, a file having a smaller
video size or a lower encoded bit rate than the main video
file can be used. In this way, the processing load
necessary for playing back the proxy video can be reduced
compared with the processing load necessary for playing
back the main video.

Furthermore, as the proxy video file, an abstracted
video having a shorter playback time than the main
video file can be used. As the abstracted video, an
abstracted time video such as shown in FIG. 15, an abstracted
shot video such as shown in FIG. 16 and an abstracted still
image video such as shown in FIG. 17 can be used.

In the case of the abstracted time video, in FIG. 15,
first, the main video is divided into time intervals T1.
Next, for each divided section T1, only a section T2
at the top of the section is, for example, extracted, and
the video combining these sections T2 is used as the
abstracted time video.
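
As a minimal sketch of this extraction, the sections T2 can be computed from the playback length and the two interval lengths. The function name `abstracted_time_sections` and the use of seconds as the time unit are assumptions made for illustration only.

```python
def abstracted_time_sections(duration, t1, t2):
    """Return (start, end) pairs: the first t2 seconds of every t1-second interval."""
    if t1 <= 0 or t2 <= 0 or t2 > t1:
        raise ValueError("require 0 < t2 <= t1")
    sections = []
    i = 0
    while i * t1 < duration:
        start = i * t1
        # The last section is cut short if the video ends inside it.
        sections.append((start, min(start + t2, duration)))
        i += 1
    return sections
```

For example, a 100-second main video with T1 = 30 and T2 = 5 yields four sections starting at 0, 30, 60 and 90 seconds, so the abstracted time video is 20 seconds long.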

In the case of the abstracted shot video, in FIG. 16,
first, the main video is divided into shots, which are the
units at which the scene of the main video changes. As for
the method of shot division, the cutting point detection
method disclosed in Japanese Patent Laid-Open No. 11-252509,
a patent application by the present applicant, can be used.
Next, for each shot S1, S2, ..., only sections T1, T2, ...
at the top of each shot are, for example, extracted,
and the video combining these sections is used as the
abstracted shot video. Incidentally, as for the length of the
sections T1, T2, ..., it is possible to use a constant
length or to decide a length proportional to the shot length.
Further, it is possible to combine several representative
shots selected from the shots and use them as the abstracted
shot video.
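
The two ways of deciding the section length can be sketched as follows, assuming that the shot start times have already been obtained by some cutting point detection. The function `abstracted_shot_sections` and its parameters `length` and `ratio` are hypothetical names chosen for this illustration.

```python
def abstracted_shot_sections(shot_starts, duration, length=None, ratio=None):
    """Return the head section (start, end) of every shot.

    shot_starts: ascending start times of the shots S1, S2, ...
    Exactly one of `length` (constant section length) or `ratio`
    (section length as a fraction of the shot length) must be given.
    """
    if (length is None) == (ratio is None):
        raise ValueError("give exactly one of length or ratio")
    bounds = list(shot_starts) + [duration]
    sections = []
    for start, end in zip(bounds, bounds[1:]):
        shot_len = end - start
        head = length if length is not None else shot_len * ratio
        # A section never extends past the end of its own shot.
        sections.append((start, start + min(head, shot_len)))
    return sections
```

A constant length gives every shot the same weight in the abstracted shot video, whereas a proportional length preserves the relative duration of the shots.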

In the case of the abstracted still image video,
in FIG. 17, first, representative still images are selected
from the main video. As for the selection method, the top
images and the like of the sections used for the abstracted
time video and the abstracted shot video can be used. Next,
the abstracted still image video is formed by combining these
still images. In this case, in order to display each
representative still image for a constant period of time and
then switch over to the next still image, dummy information D
of a certain time interval is added so that the result can be
used as a pseudo video.
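
One way to picture the pseudo video is as a timeline in which each still image is held for a fixed period. The sketch below treats the dummy information D simply as a hold duration in seconds, which is an assumption of this illustration; the functions `still_image_timeline` and `image_at` are hypothetical.

```python
def still_image_timeline(image_names, hold_seconds):
    """Map each still image to the (start, end) period during which it is shown."""
    if hold_seconds <= 0:
        raise ValueError("hold_seconds must be positive")
    return [
        (name, i * hold_seconds, (i + 1) * hold_seconds)
        for i, name in enumerate(image_names)
    ]


def image_at(timeline, t):
    """Return the still image displayed at pseudo-video time t, or None."""
    for name, start, end in timeline:
        if start <= t < end:
            return name
    return None
```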



Further, the proxy video playback portion 43 in FIG. 13
can use a playback processing system having a smaller
processing load than the main video playback
portion 42. For example, there is a method of reducing the
processing load by reducing the number of frames to be
processed, by sampling the frames to be played back
at constant intervals. Further, as disclosed in Japanese
Patent Application No. 2000-00095 "Encoded Video Data
Playback Unit And It's Storage Medium", a patent
application by the present applicant, there is also
a method of decoding only a part of the video information
when compressed data is decoded so as to reduce
the processing load.
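
The frame sampling method can be sketched in a few lines. The sketch assumes that every frame can be decoded independently of the others, which does not hold for inter-frame coded video without further care; the function name `sample_frames` is hypothetical.

```python
def sample_frames(frame_indices, interval):
    """Keep one frame out of every `interval` frames."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return list(frame_indices)[::interval]
```

With an interval of 3, only about one third of the frames are passed to the playback processing, so the processing load of the proxy video playback falls roughly in proportion.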

Next, a seventh embodiment of the present invention
will be described with reference to the block diagram of
FIG. 18. This embodiment is characterized in that a main
video file and a proxy video file are switched by using a
command input.

By control of the controller 40, the video description 
file processing portion 44 reads in the designated video 
description file from the storage unit 41 and displays the 
designated main video 1 and proxy video 2 as described in 
the sixth embodiment on the display 45. 

Next, when the controller 40 performs the switching
control of the main video and the proxy video in response to
an order from the command input portion 46, the proxy video
file information is read into the video description file
processing portion 44 as the main video file (hereinafter
referred to as the new main video file) and, conversely, the
main video file information is read as the proxy video file
(hereinafter referred to as the new proxy video file).
Subsequently, the video description file processing portion
44 takes in order the new main video file and the new proxy
video file from the storage unit 41 and inputs them to the
main video playback portion 42 and the proxy video playback
portion 43, respectively.

The main video playback portion 42 reads in order the 
main video data of the new main video file and outputs a 
screen signal, and the proxy video playback portion 43 reads 
in order the new proxy video data and outputs the screen 
signal. The display 45 takes the screen output signal and 
the audio output signal from the main video playback portion 
42 and the proxy video playback portion 43 and performs the 
screen output and the audio output. 

Incidentally, several methods can be used to set the
playback starting time when the main video and the proxy
video are switched. As one method, the starting time of each
video is used as the playback time at the time of switching.
As another method, the time of the main video shown in
FIG. 19 is inputted to the controller 40 and that time is
taken as the playback starting time of the new main video so
as to start the playback.
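
The two setting methods can be contrasted in a minimal sketch. It assumes that the old and the new main video share the same time axis, and the clamping to the video length is an addition of this sketch; the function `playback_start_time` and the method names are hypothetical.

```python
def playback_start_time(method, main_time, new_main_length):
    """Choose where the new main video starts after a switch.

    method: "from_start" uses the starting time of the video;
            "carry_over" reuses the time of the old main video.
    """
    if method == "from_start":
        return 0.0
    if method == "carry_over":
        # Clamping to the video length is an assumption of this sketch.
        return max(0.0, min(main_time, new_main_length))
    raise ValueError(f"unknown method: {method}")
```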

Next, an eighth embodiment of the present invention will
be described with reference to FIG. 20. This embodiment is
characterized in that the main video file and a plurality
of proxy video files are switched by using the command input.

By control from the controller 40, the video description
file processing portion 44 reads in the video description
file from the storage unit 41, and the main video file (for
example, the first video file) designated by the video
description file is read in order from the storage unit 41
to the main video playback portion 42, and the video is
displayed on the display 45. Further, the proxy videos 1
to n are read in order from the storage unit 41 to the proxy
video playback portions 71 to 7n, and n pieces of the proxy
videos are displayed on the display 45.

Next, by control from the command input portion 46,
when a certain proxy video file (for example, the mth proxy
video file) is designated from the controller 40 in order
to switch the main video, the information related to the
mth main video file is read into the video description file
processing portion 44, and the mth main video file is taken
in order from the storage unit 41 and inputted to the main
video playback portion 42. The main video playback portion
42 reads in order the mth main video data and displays the
video. In this case, the main video being played back
corresponds to one of the proxy videos being played
back. The switching flowchart is shown in FIG. 21A.

As for the display method of these videos, in the same
way as in FIG. 14, there are methods such as displaying the
n pieces of proxy videos inside the screen of the main video
or displaying the proxy videos outside of the screen of the
main video.

By sufficiently reducing the processing load necessary
for playing back the n pieces of the proxy videos, one piece
of the main video and the n pieces of proxy videos can be
played back even by a playback unit having a limited
processing capacity.

As a modified example of the eighth embodiment, in the
same way as in FIG. 8, there is a video switching method as
follows. By control from the controller 40, the video
description file processing portion 44 reads in the video
description file from the storage unit 41, and the main video
file (for example, the 0th video file) designated by the
video description file is read in order from the storage
unit 41 to the main video playback portion 42 and is displayed
on the display 45. Further, the proxy video files 1 to n
of the first to the nth video files described in the video
description file are read in order from the storage unit
41 to the proxy video playback portions 71 to 7n, and n
pieces of the proxy videos are displayed on the display 45.

Next, by control from the command input portion 46,
when a certain proxy video file (for example, the mth
proxy video file) is designated from the controller 40, the
mth main video file is played back in the main video playback
portion 42 and, at the same time, the information related
to the 0th proxy video file is read into the video description
file processing portion 44, and the 0th proxy video file
is read in order from the storage unit 41 to the mth proxy
video playback portion so that the playback of the mth proxy
video is switched to the playback of the 0th proxy video. In
this way, a proxy video different from the main video being
played back can always be played back, and video retrieval
efficiency can be enhanced. A switching flowchart of the
above described processing is shown in FIG. 21B.
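
The difference between the switching of the eighth embodiment and that of its modified example can be sketched as simple bookkeeping over video numbers. This models only which video occupies the main position and each proxy position, not the file reading itself; the function names `switch_keep_proxies` and `switch_swap` are hypothetical.

```python
def switch_keep_proxies(main, proxies, slot):
    """Eighth embodiment: the designated proxy's video becomes the main video
    and the proxy videos are left unchanged."""
    return proxies[slot], list(proxies)


def switch_swap(main, proxies, slot):
    """Modified example: the designated proxy's video becomes the main video
    and the old main video takes over that proxy position."""
    new_proxies = list(proxies)
    new_main = new_proxies[slot]
    new_proxies[slot] = main
    return new_main, new_proxies
```

Starting from main video 0 and proxy videos 1 to 3, designating the second proxy gives main video 2 in both cases, but only the second function leaves no proxy showing the same video as the main video.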

Next, a ninth embodiment of the present invention will 
be described with reference to FIG. 22. This embodiment is 
characterized in that the main video, the scene image of 
the main video and one or a plurality of proxy videos are 
displayed. 

In FIG. 22, by control from the controller 40, the video
description file processing portion 44 reads in the designated
video description file from the storage unit 41. Next, the
video description file processing portion 44 takes in order
the main video file designated by the read video description
file from the storage unit 41 and inputs it to the main video
playback portion 42. The main video playback portion 42
reads the main video data and outputs a screen signal. The
display 45 takes the screen output signal and the audio output
signal from the main video playback portion 42 and performs
the screen output and the audio output.

Further, the proxy video file designated by the video
description file is taken in order from the storage unit
41 and is inputted to the proxy video playback portion 43.
The proxy video playback portion 43 reads in order the proxy
video data and outputs the screen signal. The display 45
takes the screen output signal and the audio output signal
from the proxy video playback portion 43 and performs the
screen output and the audio output.

Further, by control of the controller 40, a scene
description file read-in portion 91 reads in the scene
description file corresponding to the video file from the
storage unit 41.

The current playback time obtained by the main video
playback portion 42 is inputted to the scene description
file read-in portion 91 and the time information sequence
output portion 93. The scene description file read-in
portion 91 takes the time information sequence data and the
still image sequence data from the storage unit 41
based on the inputted current playback time information,
and inputs them to the time information sequence output
portion 93 and the still image sequence output portion 94,
and the time information sequence and the still
image sequence are displayed on the display 45. As for the
main video and the proxy video, a display method such as
those of FIG. 14A and FIG. 14B can be used. As for the main
video, the time information sequence and the still image
sequence, a display method such as those of FIG. 3A and
FIG. 4A can be used.

The renewal of the display content is performed at the
time information sequence output portion 93. The timing
of the renewal of the display content is the same as that
described with reference to FIGS. 3A, 3B, FIGS. 4A, 4B and
FIG. 5; therefore, the description thereof will be omitted.

As for the time information sequence data to be inputted,
the time information sequence data positioned before and
after the current playback time can be read from the time
information sequence file stored in the storage unit 41.
As for the still image sequence data, the still image sequence
corresponding to the read time information sequence data
can be read from the still image file stored in the storage
unit 41. As for the content of the time information sequence
data, the description method as described in the first
embodiment can be used. As for an example of the time
information, the video scene switching time of FIG. 6A and
the key frame point time in the scene of FIG. 6B can be used.
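
Reading the time information positioned before and after the current playback time can be sketched with a binary search over a sorted list of times. The function `entries_around` and its parameters `before` and `after` (how many entries to read on each side) are hypothetical; the text does not state how many entries are read.

```python
from bisect import bisect_right


def entries_around(times, current, before=2, after=2):
    """Return the scene times positioned just before and after `current`.

    times: ascending list of scene switching (or key frame point) times.
    """
    # Entries at or before the current playback time lie to the left of pos.
    pos = bisect_right(times, current)
    return times[max(0, pos - before):pos + after]
```

The still images corresponding to the returned times would then be looked up in the still image file in the same order.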

Next, a tenth embodiment of the present invention
will be described with reference to FIG. 23. This embodiment
is such that the video file information designated by a
command input portion 55 is inputted to a delivery controller
54 and, by control from the delivery controller 54, the video
data designated by the command input portion 55 is taken
in order from the storage unit 41 and is inputted to a main
video transmitting portion 51a and/or a proxy video
transmitting portion 51b. Similarly, by control from the
delivery controller 54, the scene description data related
to the video designated by the command input portion 55 is
taken in order from the storage unit 41 and is inputted to
a scene description data transmitting portion 52.

The main and/or proxy video data and the scene
description data are read from the main video transmitting
portion 51a, the proxy video transmitting portion 51b and
the scene description data transmitting portion 52,
respectively, at the designated data rate and are inputted
to a network transmitting portion 53, and the respective data
are delivered to the network.

The main video transmitting rate can be decided by
calculating the information quantity per unit time from the
file size of the video file and the video playback time. The
scene description data transmitting rate can be
decided by calculating the information quantity per unit time
from the scene description file size and the video playback
time. As another method, the transmitting rate can be
decided by calculating the information quantity per unit time
from the last time to be displayed in the scene description
file and the scene description file size.
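
The rate calculation amounts to dividing a file size by a time. A minimal sketch follows, assuming sizes in bytes, times in seconds and a rate expressed in bits per second; the function name `transmitting_rate_bps` and the sample sizes are hypothetical.

```python
def transmitting_rate_bps(file_size_bytes, seconds):
    """Information quantity per unit time, in bits per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return file_size_bytes * 8 / seconds


# Illustrative values: a 45 MB video file and a 150 kB scene
# description file, both covering 600 seconds of playback.
video_rate = transmitting_rate_bps(45_000_000, 600)
scene_rate = transmitting_rate_bps(150_000, 600)
```

For the alternative method, the second argument would be the last time to be displayed in the scene description file instead of the video playback time.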

The command input portion 55 may receive, at a network
receiving portion 56, an input from the terminal connected
via a network 57 and use that information as a command input.

Next, an eleventh embodiment of the present invention
will be described with reference to FIG. 24. This embodiment
is such that, first, by control from the controller 40, the
video description file processing portion 44 reads in the
designated video description file from the storage unit 41
of a server 50 via a network 58. Next, the video description
file processing portion 44 outputs the main video file and
the proxy video file designated by the read video description
file to a network transmitting portion 61 via the controller
40. The video file information designated by the command
input portion 46 of the client unit (terminal unit) 60 is
inputted to the controller 40 and is outputted to the network
transmitting portion 61.

The designated video file information is inputted to
the network receiving portion 56 of the server 50 from the
network transmitting portion 61 via the network 58, and a
command input processing is performed through the command
input portion 55. By control from the delivery controller
54, the designated main video and proxy video are taken in
order from the storage unit 41 and are inputted to the main
video transmitting portion 51a and the proxy video
transmitting portion 51b, respectively. Similarly, by control
from the delivery controller 54, the scene description data
related to the video designated by the command input portion
46 is taken in order from the storage unit 41 and is inputted
to the scene description data transmitting portion 52.

The main and proxy video data and the scene description
data are read from the main video transmitting portion 51a,
the proxy video transmitting portion 51b and the scene
description data transmitting portion 52, respectively, at
the designated data rate and are inputted to a network
transmitting portion 53, and the respective data are delivered
to the network 58. The delivered main video data, proxy video
data and scene description data are inputted to a network
receiving portion 62 via the network 58 and are temporarily
stored in a cache memory 63.

From the cache memory 63, the main video information,
the scene description information and the proxy video
information are inputted to the main video playback portion
42, the scene description file read-in portion 91 and
the proxy video playback portion 43, respectively. The
action of the video description file processing portion 44
is the same as the action of FIG. 22 and, therefore, the
description thereof will be omitted.

The cache memory 63 reads in order, from the server 50
via the network 58, the main video information, the proxy
video and the scene description information to be displayed
after the main video information, the proxy video and the
scene description information currently displayed on the
display 45.

In general, when the scene description information file
capacity is large, a large bandwidth is required each time
the scene description information is read. Therefore, it
takes a long time until the next scene description
information is read. In addition, this may reduce
the bandwidth available for reading the video data from the
storage unit. However, when the cache memory 63 is used,
the scene description information can be continuously
displayed on the display 45. Further, by continuously
writing the scene description data into the cache memory
63, the data transmission bandwidth can be made constant and
it is, therefore, possible to continuously play back the
video without reducing the bandwidth necessary for reading
the video from the network-connected delivery unit.

As described above, according to the above described
sixth to eleventh embodiments, since a plurality of videos
are constituted by the main video and the proxy video and
these videos are allowed to be played back, a plurality of
videos can be played back even with a limited transmission
bandwidth and decoding capacity.

Further, a video scene contained in the video can
be browsed and retrieved so as to play it back.

Further, by displaying the description information in
step with the playback of the video, the browsing and the
retrieval before and after the position being played back can
be effectively performed.

Further, according to the present invention, regarding
the video files in the network-connected server, a
plurality of video files can be played back without
increasing the processing load of the playback terminal, or
the video can be transmitted and viewed even over a limited
network bandwidth.

Each of the first to eleventh embodiments is preferably
realized by a personal computer, and the program of the
processing of each embodiment can be recorded and provided
in a computer readable program recording medium. The
recording medium includes not only a portable type recording
medium such as an optical disc, a floppy disc, a hard disc
and the like, but also a transmission medium, such as a
network, which temporarily records and holds data.



